Messenger API Design Evaluation and Latency Budget
Understand how we achieve the non-functional requirements for the Messenger API.
Introduction
Modeling an adequate API is a complex task that involves tuning several technical dimensions. Each dimension has parameters to optimize, and the tradeoffs between them must be weighed. In this lesson, we’ll discuss how the non-functional requirements can be achieved and what optimization decisions we need to make. We’ll also estimate the latency and response time of our proposed Messenger API.
Non-functional requirements
The following sections discuss how the Messenger API meets the non-functional requirements:
Consistency
A chat application rolls out new features frequently, which calls for periodic API versioning. To achieve consistency across versions, we keep the endpoints, error messages, URL patterns, and relevant data entities uniform. Moreover, messages in a chat must be delivered in sequence; therefore, we use a FIFO (First In, First Out) messaging queue with strict ordering.
Point to Ponder
Question: In the context of consistency, how do we make sure that all group participants see the messages in the same order?
To ensure consistency in message ordering in a group chat, we should consider the following approaches collectively:
- Use a single database: Store all messages in a single logical database so that all participants read from the same source of data, which helps maintain consistency. Redundant copies of this database ensure the availability and reliability of the system.
- Unique sequence number: Assigning a unique sequence number to each message can help ensure that messages are delivered and displayed in the correct order. This can be achieved through a sequencer. Some sequencers enable us to infer causality or have wall-clock time as part of their sequence, which makes it possible to monotonically order messages. See sequencer design for details.
- Messaging queue: Utilizing a messaging queue ensures that messages are processed in the correct order. It also helps to prevent messages from being lost or duplicated.
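The sequence-number approach above can be sketched in a few lines. This is a minimal illustration, not the lesson's actual design: `GroupSequencer` and `OrderedInbox` are hypothetical names, the sequencer is a simple in-process counter, and a real deployment would need a distributed sequencer and persistence.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    seq: int      # sequence number assigned by the (hypothetical) sequencer
    sender: str
    text: str


class GroupSequencer:
    """Hands out strictly increasing sequence numbers for one group chat."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def stamp(self, sender: str, text: str) -> Message:
        return Message(next(self._counter), sender, text)


class OrderedInbox:
    """Client-side buffer that releases messages only in sequence order."""

    def __init__(self) -> None:
        self._next_seq = 1
        self._pending: Dict[int, Message] = {}

    def receive(self, msg: Message) -> List[Message]:
        """Accept a message and return every message that is now displayable."""
        if msg.seq < self._next_seq or msg.seq in self._pending:
            return []  # duplicate delivery, ignore it
        self._pending[msg.seq] = msg
        ready = []
        # Release the longest gap-free run starting at the next expected number.
        while self._next_seq in self._pending:
            ready.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return ready
```

Because every participant releases messages by the same sequence numbers, two clients that receive the messages in different network orders still display them identically; a message that arrives early simply waits in the buffer until the gap before it is filled.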
Availability and disaster recovery
To enhance availability, we divide responsibilities among several services and deploy redundant servers so that no single service is overburdened. Furthermore, circuit breakers prevent cascading failures. We also provision enough chat servers, along with WebSocket managers that track which client is connected to which chat server. Therefore, even if a WebSocket connection to one chat server fails, the session is recreated, possibly with a different server. Moreover, the messages are stored on highly consistent and efficient database clusters, like HBase and MyRocks, which provide high availability and reliability via region replication.
Note: HBase is an open-source distributed key-value store based on HDFS. It is known for its consistent read and write operations, scalability, and support for MapReduce jobs. MyRocks is an open-source database introduced by Facebook which integrates RocksDB as a MySQL storage engine.
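As an illustration of the circuit-breaker idea mentioned above, here is a minimal sketch. The class name, the thresholds, and the injectable `clock` parameter are assumptions made for this example; production systems typically rely on an established resilience library rather than hand-rolled code.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Fail fast instead of piling more load on a struggling service.
                raise CircuitOpenError("downstream service is being skipped")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A chat server could wrap its calls to a dependent service in `breaker.call(...)`; once the dependency fails repeatedly, callers get an immediate `CircuitOpenError` instead of waiting on timeouts, which keeps the failure from spreading to upstream services.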
Security
For authentication and authorization, a user logs in to Messenger with the username and password chosen at signup. Based on these credentials, a JSON Web Token (JWT) is generated, which is valid for the duration of that session only. Apart from authentication and authorization, we provide end-to-end encryption to secure communication. Messages are encrypted with secret keys (symmetric encryption) that are exchanged using asymmetric cryptography, primarily based on the Signal protocol.
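To make the session-token step concrete, the sketch below issues and verifies an HS256-signed JWT using only the Python standard library. The function names, the claim set (`sub`, `iat`, `exp`), the one-hour lifetime, and the choice of HS256 are assumptions for illustration; the lesson does not specify them, and a real service would normally use a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600, now=None) -> str:
    """Create a signed token after the credentials have been checked."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now=None):
    """Return the payload if the signature and expiry check out, else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = (header_b64 + "." + payload_b64).encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token
    return payload if payload.get("exp", 0) > now else None
```

The expiry claim is what ties the token to a single session: once `exp` has passed, `verify_token` rejects the token and the user has to authenticate again.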
Low latency
Low latency is imperative for a chat application. That is why we choose a WebSocket connection for real-time chat, even though it consumes more resources than a stateless HTTP connection. However, sending media files over WebSocket may increase latency because WebSocket lacks a multiplexing feature. To mitigate this issue, we upload media files via a separate HTTP connection. Also, because viral media files are often shared among many users, we use hashing to avoid storing the same content multiple times. Furthermore, we employ read-through caching for messages: repeated reads are served from the cache, and only a cache miss reaches the database, which improves the performance of our API.
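Two of the ideas above, hash-based deduplication of media and read-through caching, can be sketched as follows. Both classes are hypothetical in-memory stand-ins: SHA-256 as the content hash, the LRU eviction policy, and the `load_from_db` callback are assumptions chosen for the example, not details given in the lesson.

```python
import hashlib
from collections import OrderedDict


class MediaStore:
    """Stores each distinct media payload once, keyed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def upload(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the existing copy if this content was seen before.
        self._blobs.setdefault(digest, data)
        return digest  # messages reference the media by this identifier

    def blob_count(self) -> int:
        return len(self._blobs)


class ReadThroughCache:
    """LRU cache that loads missing keys from the backing store itself."""

    def __init__(self, load_from_db, capacity=1024):
        self._load = load_from_db
        self._capacity = capacity
        self._entries = OrderedDict()
        self.db_reads = 0  # counts how often the database was actually hit

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        self.db_reads += 1                      # cache miss: go to the database
        value = self._load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # evict the least recently used
        return value
```

With this arrangement, a video forwarded to thousands of chats occupies storage once, and the application only ever talks to the cache; the cache decides when a database read is needed.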
Achieving Non-Functional Requirements
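Non-Functional Requirements | Approaches |
Consistency | Periodic API versioning with uniform endpoints, error messages, URL patterns, and data entities; FIFO messaging queue with strict ordering; unique sequence numbers for messages |
Availability | Responsibilities divided among services with redundant servers; circuit breakers; WebSocket managers that allow sessions to be recreated on another chat server; replicated database clusters such as HBase and MyRocks |
Security | Login with username and password; session-scoped JWT; end-to-end encryption based on the Signal protocol |
Low latency | WebSocket for real-time chat; separate HTTP connection for media uploads; hash-based deduplication of media files; read-through caching of messages |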
Latency budget of the Messenger API
We are utilizing two application layer protocols—HTTP and WebSocket—in the Messenger API. Therefore, we have two operations that impact the response time of our API. One of these operations is uploading the media file via HTTP, and the other is real-time chat via WebSocket. We have discussed the response time of the file API operation while designing its respective API. In this section, we focus on the response time of the one-to-one communication between two clients.
Assumptions: We make the following assumptions before estimating the latency:
- Text messages are usually smaller than 1 KB, which means that we can use roughly the same RTT that we assumed in the back-of-the-envelope latency calculations.
- We don't consider the latency of media files, because they are sent and received through HTTP instead of WebSockets. For media files, we can refer to the estimates of the file upload API.
- The WebSocket connection is already established (setting it up takes a maximum of 275.9 ms), and the receiver is online.
Let's calculate the end-to-end delivery time of our API in the following steps:
1. The sender sends the message to its corresponding chat server in 35 ms. This is half the RTT we estimated in the back-of-the-envelope calculation.
2. The chat server processes the message in 0.125 ms to decide whom the message is intended for.
3. The message is delivered to the messaging queue in 10 ms, assuming that the messaging queue resides in another zone.
4. The messaging queue takes around 0.125 ms to process the message, place it in the queue, and decide which server the message needs to be forwarded to.
5. The message is delivered to the chat server to which the receiver is connected in 10 ms.
6. The chat server of the receiver processes the message and forwards it to the receiver in 0.125 ms.
7. The message is delivered to the receiver from the chat server in 35 ms.
8. The receiver sends the acknowledgment to the chat server, which is forwarded to all other components involved in the communication.
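The itemized durations above can be tallied with a short script. This sketch only sums the one-way steps that have a stated duration; the acknowledgment in the final step has no itemized timing in the list, so it is deliberately left out. The step labels and helper names are made up for this example.

```python
# One-way steps and their durations in milliseconds, copied from the list above.
FORWARD_STEPS_MS = [
    ("sender -> sender's chat server", 35.0),
    ("chat server processing", 0.125),
    ("chat server -> messaging queue", 10.0),
    ("messaging queue processing", 0.125),
    ("messaging queue -> receiver's chat server", 10.0),
    ("receiver's chat server processing", 0.125),
    ("receiver's chat server -> receiver", 35.0),
]


def total_ms(steps):
    """Sum the durations of all steps."""
    return sum(duration for _, duration in steps)


def network_share(steps):
    """Fraction of the total spent on network hops (labels containing '->')."""
    network = sum(duration for name, duration in steps if "->" in name)
    return network / total_ms(steps)


print(f"one-way delivery: {total_ms(FORWARD_STEPS_MS)} ms")
print(f"share spent on the network: {network_share(FORWARD_STEPS_MS):.1%}")
```

The tally makes the shape of the budget visible: the three processing steps contribute well under a millisecond in total, so the network hops dominate the one-way delivery time.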
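Adding up the timed one-way steps listed above gives 35 + 0.125 + 10 + 0.125 + 10 + 0.125 + 35 = 90.375 ms for the message to reach the receiver. Including the acknowledgment that travels back toward the sender, the whole exchange takes approximately 180.25 ms, as shown in the following slides.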
[Slides 1 to 11: step-by-step delivery of a message from the sender to the receiver]
Assuming that the initial connection takes
It is important to realize that the equation above that
In this lesson, we described how we achieve the non-functional requirements of our API for a chat application like Messenger. At the end of the lesson, we also estimated the time it takes for a message to be delivered to the receiver and for an acknowledgment to come back.